parallel passage
Intertextual Parallel Detection in Biblical Hebrew: A Transformer-Based Benchmark
Identifying parallel passages in biblical Hebrew (BH) is central to biblical scholarship because it reveals intertextual relationships. Traditional methods rely on manual comparison, a labor-intensive process prone to human error. This study evaluates the potential of pre-trained transformer-based language models, including E5, AlephBERT, MPNet, and LaBSE, for detecting textual parallels in the Hebrew Bible. Focusing on known parallels between Samuel/Kings and Chronicles, I assessed each model's ability to generate word embeddings that distinguish parallel from non-parallel passages. Using cosine similarity and Wasserstein distance as measures, I found that E5 and AlephBERT show promise: E5 excels at detecting parallels, while AlephBERT separates non-parallel passages more sharply. These findings indicate that pre-trained models can improve the efficiency and accuracy of detecting intertextual parallels in ancient texts, suggesting broader applications for ancient language studies.
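The two measures named in this abstract can be sketched in a few lines. This is a minimal illustration, not the study's pipeline: the vectors are toy stand-ins for model embeddings, the helper names are hypothetical, and comparing the distribution of similarity scores for parallel pairs against that for non-parallel pairs is only one plausible way to apply the Wasserstein distance (an assumption, since the abstract does not specify it).

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def wasserstein_1d(xs, ys):
    """1-D Wasserstein-1 distance between two equal-sized samples.

    For equal-sized samples with uniform weights this reduces to the
    mean absolute difference between the sorted values.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch only handles equal-sized samples")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# Toy "embeddings" (assumption: real ones would come from a model such as E5).
parallel_pairs = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 1.0], [2.0, 2.0])]
non_parallel_pairs = [([1.0, 0.0], [0.0, 1.0]), ([1.0, 0.0], [-1.0, 0.0])]

parallel_sims = [cosine_similarity(u, v) for u, v in parallel_pairs]
non_parallel_sims = [cosine_similarity(u, v) for u, v in non_parallel_pairs]

# A larger distance means the two score distributions are better separated.
separation = wasserstein_1d(parallel_sims, non_parallel_sims)
```

Under this reading, a model whose parallel and non-parallel similarity scores overlap heavily would yield a separation near zero.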
Towards Inference-Oriented Reading Comprehension: ParallelQA
Wadhwa, Soumya, Embar, Varsha, Grabmair, Matthias, Nyberg, Eric
In this paper, we investigate the tendency of end-to-end neural Machine Reading Comprehension (MRC) models to match shallow patterns rather than perform inference-oriented reasoning on RC benchmarks. We aim to test the ability of these systems to answer questions which focus on referential inference. We propose ParallelQA, a strategy to formulate such questions using parallel passages. We also demonstrate that existing neural models fail to generalize well to this setting.
Identification of Parallel Passages Across a Large Hebrew/Aramaic Corpus
Shmidman, Avi, Koppel, Moshe, Porat, Ely
We propose a method for efficiently finding all parallel passages in a large corpus, even if the passages are not quite identical due to rephrasing and orthographic variation. The key ideas are the representation of each word in the corpus by its two most infrequent letters, finding matched pairs of strings of four or five words that differ by at most one word and then identifying clusters of such matched pairs. Using this method, over 4600 parallel pairs of passages were identified in the Babylonian Talmud, a Hebrew-Aramaic corpus of over 1.8 million words, in just over 11 seconds. Empirical comparisons on sample data indicate that the coverage obtained by our method is essentially the same as that obtained using slow exhaustive methods.
INTRODUCTION
Ancient text corpora in classical languages such as Greek, Latin, Hebrew and Aramaic typically include numerous examples of text reuse, including repetitions of long passages of 20 words or more. Identifying such passages is important because it allows scholars to trace the development of ideas and concepts through time and across geographical ranges. Additionally, even within a given time period and geographical location, the identification of multiple parallel sources for any given idea provides a platform for scholarly inquiry.
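The first two ideas in the abstract above (two-letter word keys and near-matching word windows) can be illustrated with a small sketch. Several details here are assumptions rather than the authors' design: the function names are hypothetical, the two rarest letters are kept in their order of appearance, ties are broken alphabetically, near-matches are found by masking one position per window, only cross-passage matches are reported, and the clustering step is omitted. English words stand in for Hebrew/Aramaic text.

```python
from collections import Counter, defaultdict


def letter_frequencies(passages):
    """Count how often each letter occurs across the whole corpus."""
    return Counter(ch for words in passages.values() for w in words for ch in w)


def word_key(word, freq):
    """Reduce a word to its two rarest distinct letters, in order of appearance."""
    distinct = list(dict.fromkeys(word))
    rarest = sorted(distinct, key=lambda ch: (freq[ch], ch))[:2]
    return "".join(ch for ch in distinct if ch in rarest)


def matched_windows(passages, n=4):
    """Find pairs of n-word windows whose keys differ in at most one position."""
    freq = letter_frequencies(passages)
    index = defaultdict(set)
    for name, words in passages.items():
        keys = [word_key(w, freq) for w in words]
        for start in range(len(keys) - n + 1):
            window = keys[start:start + n]
            # Mask each position in turn; two windows that share a masked
            # form agree on the other n - 1 keys.
            for skip in range(n):
                masked = tuple(window[:skip] + ["*"] + window[skip + 1:])
                index[masked].add((name, start))
    pairs = set()
    for hits in index.values():
        for a in hits:
            for b in hits:
                if a < b and a[0] != b[0]:  # report cross-passage matches only
                    pairs.add((a, b))
    return pairs


passages = {
    "A": "the king went up to the city".split(),
    "B": "the king came up to the town".split(),
    "C": "a prophet spoke in the morning".split(),
}
pairs = matched_windows(passages)
```

Because each window is hashed under its masked forms, candidate matches are found by dictionary lookup rather than by comparing every window against every other, which is the kind of shortcut that makes a corpus-wide search feasible.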